- sampling
- A method for collecting information and drawing inferences about a larger population or universe from the analysis of only part of it, the sample. Censuses of the population are an expensive way of monitoring social and economic change, and are carried out infrequently, usually at ten-year intervals. Sampling allows surveys of the complete population of a country, or sub-sections of it, to be carried out far more cheaply and frequently, with resources devoted to improving the depth and quality of the information collected, in contrast with the shallow information obtainable from censuses. Sampling is also used in other contexts, for example as quality control in manufacturing industry. Within social science, its use as the basis of sampling methodology and inferential statistics has contributed enormous improvements in the cost-effectiveness of empirical research.
Probability sampling requires that each case in the universe being studied must have a determinate, or fixed, chance of being selected; probability statistics can then be used to measure quantitatively the risk of drawing the wrong conclusion from samples of various sizes. It seems intuitively obvious that if one in two cases is randomly selected from a population, the risk of the half so selected being unrepresentative of the whole group is far lower than if one in fifty were selected. The higher sampling fraction of one in two must give more reliable information than the sampling fraction of one case in fifty. But the actual sample size is even more important in determining how representative the sample is. A sample of about 2,500 persons has broadly the same reliability and representativeness, whether it comes from a population of 100,000 persons or one million persons. Samples of 2,000-2,500 are in fact the most common size for national samples, especially when a fairly narrow range of characteristics is being studied.
There are a variety of sample designs.
A random sample, or simple random sample, is one in which each case has an equal chance (or equal probability) of selection, so that the techniques of probability statistics can then be applied to the resulting information. A common variation on this is the stratified random sample: the population being studied is first divided into sub-groups or strata, and random sampling is applied within the strata. For example, random sampling might be applied to both the male and female groups of a population of political representatives, but using a sampling fraction of one person in twenty from the numerous male group and a sampling fraction of one person in two from the relatively small female group. Another common variation is two-stage or multi-stage (also known as complex) sampling. For example, random sampling is first used to select a limited number of local areas for a survey, and then a second stage of random sampling is applied to selecting persons or households or companies within the random sample of areas. The two stages can be extended to three or more stages, if necessary, so long as the eventual sample remains large enough to support analysis. All these sampling designs use random sampling in the final selection process, producing a list of persons from the electoral register, household addresses, company names, or other cases which constitute the final issued sample. All of these cases must be included in the study, with no substitutions allowed, in sharp contrast with the procedures for obtaining quota samples. For this reason, interviewers working on a random sample survey will exert great effort to persuade potential respondents to participate in the study. Failure to achieve interviews with the complete sample can produce non-response bias in the resulting data.
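
The stratified design with unequal sampling fractions can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population of 400 male and 40 female representatives is invented, and the function names `stratified_sample` and `weighted_proportion` are hypothetical rather than taken from any survey package. The sketch also shows why cases drawn with different sampling fractions are usually weighted by the inverse of their fraction before estimates for the whole population are made.

```python
import random


def stratified_sample(strata, fractions, seed=0):
    """Draw a simple random sample, without replacement, within each stratum.

    strata    -- dict mapping stratum name to a list of cases
    fractions -- dict mapping stratum name to its sampling fraction (0 < f <= 1)
    Returns a dict mapping stratum name to the list of selected cases.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    selected = {}
    for name, cases in strata.items():
        n = round(len(cases) * fractions[name])
        selected[name] = rng.sample(cases, n)
    return selected


def weighted_proportion(selected, fractions, has_attribute):
    """Estimate a population proportion, weighting each case by 1 / fraction."""
    weighted_hits = 0.0
    weighted_total = 0.0
    for name, cases in selected.items():
        weight = 1.0 / fractions[name]
        weighted_total += weight * len(cases)
        weighted_hits += weight * sum(1 for case in cases if has_attribute(case))
    return weighted_hits / weighted_total


# Invented population: 400 male and 40 female political representatives.
population = {
    "male": [("male", i) for i in range(400)],
    "female": [("female", i) for i in range(40)],
}
# One in twenty from the large stratum, one in two from the small one.
fractions = {"male": 1 / 20, "female": 1 / 2}

sample = stratified_sample(population, fractions)

# Unweighted, women make up half of the 40 selected cases; weighted, the
# estimate recovers their share of the whole population (40 out of 440).
share_female = weighted_proportion(sample, fractions, lambda case: case[0] == "female")
print(len(sample["male"]), len(sample["female"]), round(share_female, 3))
```

The higher fraction in the small stratum yields enough cases to analyse women separately, while the weights prevent that over-representation from distorting figures for the population as a whole.
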
The calculation of sampling errors for complex sample designs is statistically far more complicated than in the case of simple random samples.
Once the sampling fraction and sample size are known, probability theory provides the basis for a whole range of statistical inferences to be made about the characteristics of the universe, from the observed characteristics of the sample drawn from it. The standard deviation (see variation) of the distribution of sample means, which is referred to as the standard error of the means for any given characteristic (such as age), can be calculated to assess the reliability of the sample data. Large standard errors reduce our confidence that the sample is fully representative of the complete universe. Similarly, the probability of two samples yielding different measures, and the probability of obtaining particular values of correlation coefficient or other measures of association, can all be calculated. Most of the relevant calculations and significance tests are supplied in the SPSS software package. Statistics textbooks supply details of the underlying calculations.
It must be emphasized that textbooks on sampling and probability statistics are written by statisticians, and refer exclusively to the case of the single random sample on a topic on which the statistician or researcher is entirely ignorant, having absolutely no substantive information other than that supplied by the sample. Deductions and inferences are therefore restricted to those that can be calculated mathematically. It is rare for a sociologist or other social scientist to be in this position. Good researchers bring a great deal of substantive knowledge to bear on assessing the validity and reliability of survey results, and they supplement statistical measures with other methods for increasing confidence in the reliability of sample survey results, and the interpretations placed on them.
These methods include triangulation; repeat surveys (as illustrated by opinion polls); literature surveys which yield information on earlier replications; as well as theoretical assessments. Statistical measures of reliability, association, or significance are not the same as assessments of the substantive importance of a result. Social surveys can sometimes be over-engineered, in seeking (for example) to establish whether the exact incidence of something is 31 per cent or 36 per cent, whereas in practice all that matters is whether it is about one-third or about one in thirty.
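
The standard error calculations mentioned in this entry, and the claim that sample size matters more than sampling fraction, can be illustrated with a short Python sketch. It assumes a simple random sample drawn without replacement, uses the textbook finite population correction, and takes 1.96 as the multiplier for an approximate 95 per cent confidence interval. The function names and the list of ages are invented for the example, and the formulas do not apply unchanged to complex sample designs.

```python
import math
import statistics


def standard_error_of_mean(values):
    """Estimated standard error of the mean: sample std deviation / sqrt(n)."""
    return statistics.stdev(values) / math.sqrt(len(values))


def standard_error_of_proportion(p, n, population_size=None):
    """Standard error of a proportion p from a simple random sample of size n.

    If population_size (N) is given, the finite population correction
    sqrt((N - n) / (N - 1)) for sampling without replacement is applied.
    """
    se = math.sqrt(p * (1 - p) / n)
    if population_size is not None:
        se *= math.sqrt((population_size - n) / (population_size - 1))
    return se


def margin_of_error(se, z=1.96):
    """Half-width of an approximate 95 per cent confidence interval."""
    return z * se


# Same sample size (2,500), very different population sizes.
moe_100k = margin_of_error(standard_error_of_proportion(0.5, 2500, 100_000))
moe_1m = margin_of_error(standard_error_of_proportion(0.5, 2500, 1_000_000))

# Same sampling fraction as 2,500 from 125,000 (one in fifty), but only 50 cases.
moe_small = margin_of_error(standard_error_of_proportion(0.5, 50, 2_500))

# Standard error of the mean for an invented sample of ages.
ages = [23, 35, 41, 29, 52, 47, 38, 31]
se_age = standard_error_of_mean(ages)

print(round(moe_100k, 4), round(moe_1m, 4), round(moe_small, 4), round(se_age, 2))
```

Under these assumptions, the two samples of 2,500 both give a margin of error of roughly two percentage points, whereas a one-in-fifty sample of only 50 cases gives a margin several times larger. The population size barely enters the result. Such figures measure statistical precision only and say nothing about whether a difference is substantively important.
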

*Dictionary of sociology.
2013.*